Modeling phone duration: application to Catalan TTS

نویسندگان

  • Albert Febrer
  • Jaume Padrell
  • Antonio Bonafonte
چکیده

There are many exhaustive works that deal with the use of models for segmental duration. The aim of this paper is to evaluate some of the properties mentioned in literature and evaluate factorial and sum-of-products models in front of a listlike approach for Catalan language as a base for a most exhaustive study on duration in this language. Sum-of-products models for vowels and subsystems of consonants seem to be more adequate to model phone duration. The parameters for the sum-of-products models are presented in the paper.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Catalan vowel duration

The temporal organization of discourse has produced a great deal of works in several languages pointing to different aims: from studies where the identification of cues about the planning of linguistic message is treated to studies in which duration models for text-to-speech systems are proposed. This work is a first step towards the description of Catalan vowel duration. Considering the Catala...

متن کامل

Analysis of Duration Prediction Accuracy in HMM-Based Speech Synthesis

Appropriate phoneme durations are essential for high quality speech synthesis. In hidden Markov model-based text-tospeech (HMM-TTS), durations are typically modeled statistically using state duration probability distributions and duration prediction for unseen contexts. Use of rich context features enables synthesis without high-level linguistic knowledge. In this paper we analyze the accuracy ...

متن کامل

Modeling vowel duration for Japanese text-to-speech synthesis

Accurate estimation of segmental durations is crucial for naturalsounding text-to-speech (TTS) synthesis. This paper presents a model of vowel duration used in the Bell Labs Japanese TTS system. We describe the constraints on vowel devoicing, and effects of factors such as phone identity, surrounding phone identities, accentuation, syllabic structure, and phrasal position on the duration of bot...

متن کامل

Modeling segmental durations for Japanese text-to-speech synthesis

Accurate estimation of segmental durations is crucial for naturalsounding text-to-speech (TTS) synthesis. This paper presents a model of segmental duration used in the Bell Labs Japanese TTS system. We describe the constraints on vowel devoicing, and effects of factors such as phone identity, surrounding phone identities, accentuation, syllabic structure, and phrasal position on the duration of...

متن کامل

Modeling and Synthesizing Emotional Speech for Catalan Text-to-Speech Synthesis

This paper describes an initial approach to emotional speech synthesis in Catalan based on a diphone concatenation TTS system. The main goal of this work is to develop a simple prosodic model for expressive synthesis. This model is obtained from an emotional speech collection artificially generated by means of a copy-prosody experiment. After validating the emotional content of this collection,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998